Why Does Unsupervised Pre-training Help Deep Learning?

Authors

  • Dumitru Erhan
  • Yoshua Bengio
  • Aaron C. Courville
  • Pierre-Antoine Manzagol
  • Pascal Vincent
  • Samy Bengio
Abstract

Much recent research has been devoted to learning algorithms for deep architectures such as Deep Belief Networks and stacks of auto-encoder variants, with impressive results obtained in several areas, mostly on vision and language data sets. The best results obtained on supervised learning tasks involve an unsupervised learning component, usually in an unsupervised pre-training phase. Even though these new algorithms have enabled training deep models, many questions remain as to the nature of this difficult learning problem. The main question investigated here is the following: how does unsupervised pre-training work? Answering this question is important if learning in deep architectures is to be further improved. We propose several explanatory hypotheses and test them through extensive simulations. We empirically show the influence of pre-training with respect to architecture depth, model capacity, and number of training examples. The experiments confirm and clarify the advantage of unsupervised pre-training. The results suggest that unsupervised pre-training guides the learning towards basins of attraction of minima that support better generalization from the training data set; the evidence from these results supports a regularization explanation for the effect of pre-training.
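The training recipe studied in the paper, greedy layer-wise unsupervised pre-training followed by supervised training, can be sketched in a few lines of NumPy. This is a minimal single-layer illustration on synthetic data, not the authors' experimental code: one autoencoder layer is pre-trained on reconstruction error, then a logistic output layer is trained on the learned features (the paper fine-tunes the whole network; only the output layer is trained here for brevity). All data and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 200 points, 10 features, labels from a simple linear rule.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

n_hidden, lr = 16, 0.1

# --- Unsupervised pre-training: a single autoencoder layer --------------
# Minimise the reconstruction error ||X - decode(encode(X))||^2 with
# tied weights (the decoder reuses W.T), a common simplification.
W = rng.normal(scale=0.1, size=(10, n_hidden))
b = np.zeros(n_hidden)
c = np.zeros(10)
for _ in range(200):
    H = sigmoid(X @ W + b)            # encoder activations
    err = H @ W.T + c - X             # reconstruction residual
    dH = err @ W * H * (1 - H)        # backprop through the encoder
    W -= lr / len(X) * (X.T @ dH + err.T @ H)
    b -= lr / len(X) * dH.sum(axis=0)
    c -= lr / len(X) * err.sum(axis=0)

# --- Supervised phase: logistic output layer on learned features --------
V = np.zeros(n_hidden)
v0 = 0.0
H = sigmoid(X @ W + b)                # features are fixed from here on
for _ in range(500):
    p = sigmoid(H @ V + v0)
    g = p - y                         # cross-entropy gradient wrt logits
    V -= lr / len(X) * H.T @ g
    v0 -= lr * g.mean()

acc = ((sigmoid(H @ V + v0) > 0.5) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

In the paper's terms, the pre-training phase picks the starting point of supervised optimization, which is what places the solution in a particular basin of attraction.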


Similar articles

Summary and discussion of: “Why Does Unsupervised Pre-training Help Deep Learning?”

Before getting into how unsupervised pre-training improves the performance of deep architectures, let’s first look into some basics. Let’s start with logistic regression, one of the first classification models taught in machine learning. Logistic classification deals with the supervised learning problem of learning a mapping F : X → Y given a set of training points X = {x1 ....
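The mapping F : X → Y described above can be made concrete with a minimal logistic regression fit by full-batch gradient descent. The data, learning rate, and iteration count below are illustrative, not from the summary:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training set: points in R^2, binary labels from a linear rule.
X = rng.normal(size=(100, 2))
y = (2 * X[:, 0] - X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr = 0.5

for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(y = 1 | x)
    g = p - y                               # cross-entropy gradient wrt logits
    w -= lr * X.T @ g / len(X)
    b -= lr * g.mean()

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
accuracy = ((p > 0.5) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

This shallow, convex model is the baseline that deep architectures extend: stacking nonlinear layers below the logistic output makes the optimization non-convex, which is exactly where the choice of starting point, and hence pre-training, begins to matter.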


Unsupervised Feature Learning With Symmetrically Connected Convolutional Denoising Auto-encoders

Unsupervised pre-training was a critical technique for training deep neural networks years ago. With sufficient labeled data and modern training techniques, it is possible to train very deep neural networks from scratch in a purely supervised manner nowadays. However, unlabeled data is easier to obtain and usually of very large scale. How to make use of them better to help supervised learning i...


Deep Learning of Representations for Unsupervised and Transfer Learning

Deep learning algorithms seek to exploit the unknown structure in the input distribution in order to discover good representations, often at multiple levels, with higher-level learned features defined in terms of lower-level features. The objective is to make these higher-level representations more abstract, with their individual features more invariant to most of the variations that are typical...


Unsupervised Pre-training With Seq2Seq Reconstruction Loss for Deep Relation Extraction Models

Relation extraction models based on deep learning have been attracting a lot of attention recently. Little research has been carried out to reduce their need for labeled training data. In this work, we propose an unsupervised pre-training method based on the sequence-to-sequence model for deep relation extraction models. The pre-trained models need only half or even less training data to achieve equiv...


The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training

Whereas theoretical work suggests that deep architectures might be more efficient at representing highly-varying functions, training deep architectures was unsuccessful until the recent advent of algorithms based on unsupervised pretraining. Even though these new algorithms have enabled training deep models, many questions remain as to the nature of this difficult learning problem. Answering th...



Journal:

Volume   Issue 

Pages  -

Publication date: 2010